{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "369c3444",
   "metadata": {},
   "source": [
    "# ReadtheDocs Retrieval Augmented Generation (RAG) using Zilliz Free Tier"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6ffd11a",
   "metadata": {},
   "source": [
    "In this notebook, we are going to use Milvus documentation pages to create a chatbot about our product.  The chatbot is going to follow RAG steps to retrieve chunks of data using Semantic Vector Search, then the Question + Context will be fed as a Prompt to a LLM to generate an answer.\n",
    "\n",
    "Many RAG demos use OpenAI for the Embedding Model and ChatGPT for the Generative AI model.  **In this notebook, we will demo a fully open source RAG stack.**\n",
    "\n",
    "Using open-source Q&A with retrieval saves money since we make free calls to our own data almost all the time - retrieval, evaluation, and development iterations.  We only make a paid call to OpenAI once for the final chat generation step. \n",
    "\n",
    "<div>\n",
    "<img src=\"../../images/rag_image.png\" width=\"80%\"/>\n",
    "</div>\n",
    "\n",
    "Let's get started!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "f2865a94",
   "metadata": {},
   "outputs": [],
   "source": [
    "# For colab install these libraries in this order:\n",
    "# !pip install pymilvus langchain\n",
    "# !pip install python-dotenv unstructured openai"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d7570b2e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Import common libraries.\n",
    "import sys, os, time, pprint\n",
    "import numpy as np\n",
    "\n",
    "# Import custom functions for splitting and search.\n",
    "sys.path.append(\"..\")  # Adds higher directory to python modules path.\n",
    "import milvus_utilities as _utils"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b98278d4",
   "metadata": {},
   "source": [
    "## Download Milvus documentation to a local directory.\n",
    "\n",
    "The data we’ll use is our own product documentation web pages.  ReadTheDocs is an open-source free software documentation hosting platform, where documentation is written with the Sphinx document generator.\n",
    "\n",
    "The code block below downloads the web pages into a local directory called `rtdocs`.  \n",
    "\n",
    "I've already uploaded the `rtdocs` data folder to github, so you should see it if you cloned my repo."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "a2a7ff9d",
   "metadata": {},
   "outputs": [],
   "source": [
    "# # Uncomment to download readthedocs pages locally.\n",
    "\n",
    "# DOCS_PAGE=\"https://pymilvus.readthedocs.io/en/latest/\"\n",
    "# !echo $DOCS_PAGE\n",
    "\n",
    "# # Specify encoding to handle non-unicode characters in documentation.\n",
    "# !wget -r -A.html -P rtdocs --header=\"Accept-Charset: UTF-8\" $DOCS_PAGE"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fb844837",
   "metadata": {},
   "source": [
    "## Start up a Zilliz free tier cluster.\n",
    "\n",
    "Code in this notebook uses fully-managed Milvus on [Ziliz Cloud free trial](https://cloud.zilliz.com/login).  \n",
    "  1. Choose the default \"Starter\" option when you provision > Create collection > Give it a name > Create cluster and collection.  \n",
    "  2. On the Cluster main page, copy your `API Key` and store it locally in a .env variable.  See note below how to do that.\n",
    "  3. Also on the Cluster main page, copy the `Public Endpoint URI`.\n",
    "\n",
    "💡 Note: To keep your tokens private, best practice is to use an **env variable**.  See [how to save api key in env variable](https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety). <br>\n",
    "\n",
    "👉🏼 In Jupyter, you need a .env file (in same dir as notebooks) containing lines like this:\n",
    "- ZILLIZ_API_KEY=f370c...\n",
    "- OPENAI_API_KEY=sk-H...\n",
    "- VARIABLE_NAME=value...\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "0806d2db",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Type of server: Zilliz Cloud Vector Database(Compatible with Milvus 2.3)\n"
     ]
    }
   ],
   "source": [
    "# STEP 1. CONNECT TO MILVUS\n",
    "\n",
    "# !pip install pymilvus #python sdk for milvus\n",
    "from pymilvus import connections, utility\n",
    "from dotenv import load_dotenv\n",
    "load_dotenv()\n",
    "TOKEN = os.getenv(\"ZILLIZ_API_KEY\")\n",
    "\n",
    "# Connect to Zilliz cloud using endpoint URI and API key TOKEN.\n",
    "# TODO change this.\n",
    "CLUSTER_ENDPOINT=\"https://in03-xxxx.api.gcp-us-west1.zillizcloud.com:443\"\n",
    "connections.connect(\n",
    "  alias='default',\n",
    "  #  Public endpoint obtained from Zilliz Cloud\n",
    "  uri=CLUSTER_ENDPOINT,\n",
    "  # API key or a colon-separated cluster username and password\n",
    "  token=TOKEN,\n",
    ")\n",
    "\n",
    "# Check if the server is ready and get colleciton name.\n",
    "print(f\"Type of server: {utility.get_server_version()}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "130058f8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "model_name: text-embedding-3-small\n",
      "EMBEDDING_DIM: 1536\n"
     ]
    }
   ],
   "source": [
    "# STEP 2. EMBEDDING MODEL.\n",
    "\n",
    "import openai, pprint\n",
    "from openai import OpenAI\n",
    "\n",
    "# OpenAI embedding model name, `text-embedding-3-large` or `ext-embedding-3-small`.\n",
    "EMBEDDING_MODEL = \"text-embedding-3-small\"\n",
    "EMBEDDING_DIM = 512 # 512 or 1536 possible for 3-small\n",
    "\n",
    "# See how to save api key in env variable.\n",
    "# https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety\n",
    "openai_client = OpenAI(\n",
    "    # This is the default and can be omitted\n",
    "    api_key=os.environ.get(\"OPENAI_API_KEY\"),\n",
    ")\n",
    "\n",
    "# Get the model parameters and save for later.\n",
    "response = openai_client.embeddings.create(\n",
    "    input=\"Your text string goes here\",\n",
    "    model=EMBEDDING_MODEL\n",
    ")\n",
    "res_embedding = response.data[0].embedding\n",
    "# print(f'{res_embedding[:20]} ...')\n",
    "EMBEDDING_DIM = len(res_embedding)\n",
    "\n",
    "# Inspect model parameters.\n",
    "print(f\"model_name: {EMBEDDING_MODEL}\")\n",
    "print(f\"EMBEDDING_DIM: {EMBEDDING_DIM}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a003cc65",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "model_name: text-embedding-3-small\n",
      "EMBEDDING_DIM: 512\n"
     ]
    }
   ],
   "source": [
    "# NOW TRY REDUCED DIMENSIONS!\n",
    "# https://openai.com/blog/new-embedding-models-and-api-updates\n",
    "# OpenAI embedding model name, `text-embedding-3-large` or `ext-embedding-3-small`.\n",
    "EMBEDDING_MODEL = \"text-embedding-3-small\"\n",
    "EMBEDDING_DIM = 512 # 512 or 1536 possible for 3-small\n",
    "\n",
    "# Get the model parameters and save for later.\n",
    "response = openai_client.embeddings.create(\n",
    "    input=\"Your text string goes here\",\n",
    "    model=EMBEDDING_MODEL,\n",
    "    dimensions=EMBEDDING_DIM\n",
    ")\n",
    "res_embedding = response.data[0].embedding\n",
    "# print(f'{res_embedding[:20]} ...')\n",
    "EMBEDDING_DIM = len(res_embedding)\n",
    "\n",
    "# Inspect model parameters.\n",
    "print(f\"model_name: {EMBEDDING_MODEL}\")\n",
    "print(f\"EMBEDDING_DIM: {EMBEDDING_DIM}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "af39879a",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Successfully created collection: `MilvusDocs_text_embedding_3_small`\n",
      "{'collection_name': 'MilvusDocs_text_embedding_3_small', 'auto_id': True, 'num_shards': 1, 'description': '', 'fields': [{'field_id': 100, 'name': 'id', 'description': '', 'type': 5, 'params': {}, 'element_type': 0, 'auto_id': True, 'is_primary': True}, {'field_id': 101, 'name': 'vector', 'description': '', 'type': 101, 'params': {'dim': 512}, 'element_type': 0}], 'aliases': [], 'collection_id': 447376431941003159, 'consistency_level': 3, 'properties': {}, 'num_partitions': 1, 'enable_dynamic_field': True}\n"
     ]
    }
   ],
   "source": [
    "# STEP 3. CREATE A NO-SCHEMA MILVUS COLLECTION AND DEFINE THE DATABASE INDEX.\n",
    "\n",
    "from pymilvus import MilvusClient\n",
    "\n",
    "# Set the Milvus collection name.\n",
    "COLLECTION_NAME = \"MilvusDocs_text_embedding_3_small\"\n",
    "\n",
    "# Add custom HNSW search index to the collection.\n",
    "# M = max number graph connections per layer. Large M = denser graph.\n",
    "# Choice of M: 4~64, larger M for larger data and larger embedding lengths.\n",
    "M = 16\n",
    "# efConstruction = num_candidate_nearest_neighbors per layer. \n",
    "# Use Rule of thumb: int. 8~512, efConstruction = M * 2.\n",
    "efConstruction = M * 2\n",
    "# Create the search index for local Milvus server.\n",
    "INDEX_PARAMS = dict({\n",
    "    'M': M,               \n",
    "    \"efConstruction\": efConstruction })\n",
    "index_params = {\n",
    "    \"index_type\": \"HNSW\", \n",
    "    \"metric_type\": \"COSINE\", \n",
    "    \"params\": INDEX_PARAMS\n",
    "    }\n",
    "\n",
    "# Use no-schema Milvus client uses flexible json key:value format.\n",
    "# https://milvus.io/docs/using_milvusclient.md\n",
    "mc = MilvusClient(\n",
    "    uri=CLUSTER_ENDPOINT,\n",
    "    # API key or a colon-separated cluster username and password\n",
    "    token=TOKEN)\n",
    "\n",
    "# Check if collection already exists, if so drop it.\n",
    "has = utility.has_collection(COLLECTION_NAME)\n",
    "if has:\n",
    "    drop_result = utility.drop_collection(COLLECTION_NAME)\n",
    "    print(f\"Successfully dropped collection: `{COLLECTION_NAME}`\")\n",
    "\n",
    "# Create the collection.\n",
    "mc.create_collection(COLLECTION_NAME, \n",
    "                     EMBEDDING_DIM,\n",
    "                     consistency_level=\"Eventually\", \n",
    "                     auto_id=True,  \n",
    "                     overwrite=True,\n",
    "                     # skip setting params below, if using AUTOINDEX\n",
    "                     params=index_params\n",
    "                    )\n",
    "\n",
    "print(f\"Successfully created collection: `{COLLECTION_NAME}`\")\n",
    "print(mc.describe_collection(COLLECTION_NAME))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "73135e22",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "loaded 8 documents\n"
     ]
    }
   ],
   "source": [
    "# STEP 4. PREPARE DATA: CHUNK AND EMBED\n",
    "\n",
    "# Read docs into LangChain\n",
    "#!pip install langchain \n",
    "from langchain.document_loaders import DirectoryLoader\n",
    "\n",
    "# Load HTML files from a local directory\n",
    "path = \"../RAG/rtdocs/pymilvus.readthedocs.io/en/latest/\"\n",
    "loader = DirectoryLoader(path, glob='*.html')\n",
    "docs = loader.load()\n",
    "\n",
    "num_documents = len(docs)\n",
    "print(f\"loaded {num_documents} documents\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "26b2f7eb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "chunk_size: 511, chunk_overlap: 51.0\n",
      "chunking time: 0.01361989974975586\n",
      "docs: 8, split into: 8\n",
      "split into chunks: 161, type: list of <class 'langchain_core.documents.base.Document'>\n",
      "\n",
      "Looking at a sample chunk...\n",
      "pymilvus latest Table of Contents Installation Installing via pip Installing in a virtual environmen\n",
      "{'h1': 'pymilvus latest Table of Contents Installation Installing via pip Installing in a virtual environmen', 'h2': 'Installing via pip', 'source': '../RAG/rtdocs/pymilvus.readthedocs.io/en/latest/install.html'}\n"
     ]
    }
   ],
   "source": [
    "from langchain.text_splitter import HTMLHeaderTextSplitter, RecursiveCharacterTextSplitter\n",
    "from bs4 import BeautifulSoup\n",
    "\n",
    "# Define the headers to split on for the HTMLHeaderTextSplitter\n",
    "headers_to_split_on = [\n",
    "    (\"h1\", \"Header 1\"),\n",
    "    (\"h2\", \"Header 2\"),\n",
    "]\n",
    "# Create an instance of the HTMLHeaderTextSplitter\n",
    "html_splitter = HTMLHeaderTextSplitter(headers_to_split_on=headers_to_split_on)\n",
    "\n",
    "# Specify chunk size and overlap.\n",
    "chunk_size = 511 \n",
    "chunk_overlap = np.round(chunk_size * 0.10, 0)\n",
    "print(f\"chunk_size: {chunk_size}, chunk_overlap: {chunk_overlap}\")\n",
    "\n",
    "# Create an instance of the RecursiveCharacterTextSplitter\n",
    "child_splitter = RecursiveCharacterTextSplitter(\n",
    "    chunk_size = chunk_size,\n",
    "    chunk_overlap = chunk_overlap,\n",
    "    length_function = len,\n",
    ")\n",
    "\n",
    "# Split the HTML text using the HTMLHeaderTextSplitter.\n",
    "start_time = time.time()\n",
    "html_header_splits = []\n",
    "for doc in docs:\n",
    "    soup = BeautifulSoup(doc.page_content, 'html.parser')\n",
    "    splits = html_splitter.split_text(str(soup))\n",
    "    for split in splits:\n",
    "        # Add the source URL and header values to the metadata\n",
    "        metadata = {}\n",
    "        new_text = split.page_content\n",
    "        for header_name, metadata_header_name in headers_to_split_on:\n",
    "            # Handle exception if h1 does not exist.\n",
    "            try:\n",
    "                header_value = new_text.split(\"¶ \")[0].strip()[:100]\n",
    "                metadata[header_name] = header_value\n",
    "            except:\n",
    "                break\n",
    "            # Handle exception if h2 does not exist.\n",
    "            try:\n",
    "                new_text = new_text.split(\"¶ \")[1].strip()[:50]\n",
    "            except:\n",
    "                break\n",
    "        split.metadata = {\n",
    "            **metadata,\n",
    "            \"source\": doc.metadata[\"source\"]\n",
    "        }\n",
    "        # Add the header to the text\n",
    "        split.page_content = split.page_content\n",
    "    html_header_splits.extend(splits)\n",
    "\n",
    "# Split the documents further into smaller, recursive chunks.\n",
    "chunks = child_splitter.split_documents(html_header_splits)\n",
    "\n",
    "end_time = time.time()\n",
    "print(f\"chunking time: {end_time - start_time}\")\n",
    "print(f\"docs: {len(docs)}, split into: {len(html_header_splits)}\")\n",
    "print(f\"split into chunks: {len(chunks)}, type: list of {type(chunks[0])}\") \n",
    "\n",
    "# Inspect a chunk.\n",
    "print()\n",
    "print(\"Looking at a sample chunk...\")\n",
    "print(chunks[0].page_content[:100])\n",
    "print(chunks[0].metadata)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "46ccee07",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pymilvus latest Table of Contents Installation Installing via pip Installing in a virtual environmen\n",
      "{'h1': 'pymilvus latest Table of Contents Installation Installing via pip Installing in a virtual environmen', 'h2': 'Installing via pip', 'source': 'https://pymilvus.readthedocs.io/en/latest/install.html'}\n"
     ]
    }
   ],
   "source": [
    "# Clean up the metadata urls\n",
    "for doc in chunks:\n",
    "    new_url = doc.metadata[\"source\"]\n",
    "    new_url = new_url.replace(\"../RAG/rtdocs\", \"https:/\")\n",
    "    doc.metadata.update({\"source\": new_url})\n",
    "\n",
    "print(chunks[0].page_content[:100])\n",
    "print(chunks[0].metadata)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "3a0f232f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Start inserting entities\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 1/1 [00:01<00:00,  1.77s/it]\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Milvus Client insert time for 161 vectors: 1.803873062133789 seconds\n"
     ]
    }
   ],
   "source": [
    "# STEP 5. INSERT CHUNKS AND EMBEDDINGS IN ZILLIZ.\n",
    "\n",
    "# Convert chunks to a list of dictionaries.\n",
    "chunk_list = []\n",
    "for chunk in chunks:\n",
    "\n",
    "    response = openai_client.embeddings.create(\n",
    "        input=chunk.page_content,\n",
    "        model=EMBEDDING_MODEL,\n",
    "        dimensions=EMBEDDING_DIM\n",
    "    )\n",
    "    # OpenAI embeddings already normalized, use raw embedding values.\n",
    "    embeddings = response.data[0].embedding\n",
    "    \n",
    "    # Only use h1, h2. Truncate the metadata in case too long.\n",
    "    try:\n",
    "        h2 = chunk.metadata['h2'][:50]\n",
    "    except:\n",
    "        h2 = \"\"\n",
    "    # Assemble embedding vector, original text chunk, metadata.\n",
    "    chunk_dict = {\n",
    "        'vector': embeddings,\n",
    "        'chunk': chunk.page_content,\n",
    "        'source': chunk.metadata['source'],\n",
    "        'h1': chunk.metadata['h1'][:50],\n",
    "        'h2': h2,\n",
    "    }\n",
    "    chunk_list.append(chunk_dict)\n",
    "\n",
    "# Insert data into the Milvus collection.\n",
    "print(\"Start inserting entities\")\n",
    "start_time = time.time()\n",
    "insert_result = mc.insert(\n",
    "    COLLECTION_NAME,\n",
    "    data=chunk_list,\n",
    "    progress_bar=True)\n",
    "end_time = time.time()\n",
    "print(f\"Milvus Client insert time for {len(chunk_list)} vectors: {end_time - start_time} seconds\")\n",
    "\n",
    "# After final entity is inserted, call flush to stop growing segments left in memory.\n",
    "mc.flush(COLLECTION_NAME)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "4f33f663",
   "metadata": {},
   "outputs": [],
   "source": [
    "# # TODO: Uncomment to inspect a chunk row.\n",
    "# pprint.pprint(chunk_list[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "84e20b3d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "query length: 75\n"
     ]
    }
   ],
   "source": [
    "# Define a sample question about your data.\n",
    "QUESTION1 = \"What do the parameters for HNSW mean?\"\n",
    "QUESTION2 = \"What are good default values for HNSW parameters with 25K vectors dim 1024?\"\n",
    "QUESTION3 = \"What is the default AUTOINDEX distance metric in Milvus Client?\"\n",
    "QUERY = [QUESTION1, QUESTION2, QUESTION3]\n",
    "\n",
    "# Inspect the length of the query.\n",
    "QUERY_LENGTH = len(QUESTION2)\n",
    "print(f\"query length: {QUERY_LENGTH}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "98322752",
   "metadata": {},
   "outputs": [],
   "source": [
    "# SELECT A PARTICULAR QUESTION TO ASK.\n",
    "\n",
    "SAMPLE_QUESTION = QUESTION1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "65973c44",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Milvus Client search time for 161 vectors: 0.059777021408081055 seconds\n",
      "type: <class 'list'>, count: 2\n"
     ]
    }
   ],
   "source": [
    "# RETRIEVAL USING MILVUS API.\n",
    "\n",
    "# # Not needed with Milvus Client API.\n",
    "# mc.load()\n",
    "\n",
    "# Embed the question using the same encoder.\n",
    "response = openai_client.embeddings.create(\n",
    "    input=SAMPLE_QUESTION,\n",
    "    model=EMBEDDING_MODEL,\n",
    "    dimensions=EMBEDDING_DIM\n",
    ")\n",
    "query_embeddings = response.data[0].embedding\n",
    "TOP_K = 2\n",
    "\n",
    "# Return top k results with HNSW index.\n",
    "SEARCH_PARAMS = dict({\n",
    "    # Re-use index param for num_candidate_nearest_neighbors.\n",
    "    \"ef\": INDEX_PARAMS['efConstruction']\n",
    "    })\n",
    "\n",
    "# Define output fields to return.\n",
    "OUTPUT_FIELDS = [\"h1\", \"h2\", \"source\", \"chunk\"]\n",
    "\n",
    "# Run semantic vector search using your query and the vector database.\n",
    "start_time = time.time()\n",
    "results = mc.search(\n",
    "    COLLECTION_NAME,\n",
    "    data=[query_embeddings], \n",
    "    search_params=SEARCH_PARAMS,\n",
    "    output_fields=OUTPUT_FIELDS, \n",
    "    # Milvus can utilize metadata in boolean expressions to filter search.\n",
    "    # filter=\"\",\n",
    "    limit=TOP_K,\n",
    "    consistency_level=\"Eventually\"\n",
    "    )\n",
    "\n",
    "elapsed_time = time.time() - start_time\n",
    "print(f\"Milvus Client search time for {len(chunk_list)} vectors: {elapsed_time} seconds\")\n",
    "\n",
    "# Inspect search result.\n",
    "print(f\"type: {type(results[0])}, count: {len(results[0])}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "7d2b6373",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Length context: 509, Number of contexts: 2\n",
      "Retrieved result #1\n",
      "distance = 0.5799391269683838\n",
      "Context: count. HNSW¶ HNSW (Hierarchical Navigable Small World Graph) is a graph-based indexing algorithm. It builds a multi-layer navigation structure for an \n",
      "Metadata: {'h1': 'pymilvus latest Table of Contents Installation Tut', 'h2': 'Milvus support to create index to accelerate vecto', 'source': 'https://pymilvus.readthedocs.io/en/latest/param.html'}\n",
      "\n",
      "Retrieved result #2\n",
      "distance = 0.5569350719451904\n",
      "Context: HNSW client create_index collection_name IndexType HNSW \"M\" 16 # int. 4~64 \"efConstruction\" 40 # int. 8~512 search parameters: ef: Take the effect in \n",
      "Metadata: {'h1': 'pymilvus latest Table of Contents Installation Tut', 'h2': 'Milvus support to create index to accelerate vecto', 'source': 'https://pymilvus.readthedocs.io/en/latest/param.html'}\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Assemble `num_shot_answers` retrieved 1st context and context metadata.\n",
    "METADATA_FIELDS = [f for f in OUTPUT_FIELDS if f != 'chunk']\n",
    "formatted_results, context, context_metadata = _utils.client_assemble_retrieved_context(\n",
    "    results, metadata_fields=METADATA_FIELDS, num_shot_answers=3)\n",
    "print(f\"Length context: {len(context[0])}, Number of contexts: {len(context)}\")\n",
    "\n",
    "# TODO - Uncomment to loop through each context and metadata and print.\n",
    "for i in range(len(context)):\n",
    "    print(f\"Retrieved result #{i+1}\")\n",
    "    print(f\"distance = {formatted_results[i][0]}\")\n",
    "    print(f\"Context: {context[i][:150]}\")\n",
    "    print(f\"Metadata: {context_metadata[i]}\")\n",
    "    print()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "302341e8",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Question: What do the parameters for HNSW mean?\n",
      "('Answer: The parameters for HNSW in Milvus refer to the configuration options '\n",
      " 'for the Hierarchical Navigable Small World index. These parameters include '\n",
      " '\"M\" for the number of bi-directional links created for each element during '\n",
      " 'construction, \"efConstruction\" for the size of the dynamic list for the '\n",
      " 'nearest neighbors search, and \"ef\" for the size of the dynamic list for the '\n",
      " 'nearest neighbors search during query. These parameters help optimize the '\n",
      " 'performance of the index for efficient similarity search operations. \\n'\n",
      " 'Source: https://pymilvus.readthedocs.io/en/latest/param.html')\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "import openai, pprint\n",
    "from openai import OpenAI\n",
    "\n",
    "# Define the generation llm model to use.\n",
    "# https://openai.com/blog/new-embedding-models-and-api-updates\n",
    "# Customers using the pinned gpt-3.5-turbo model alias will be automatically upgraded to gpt-3.5-turbo-0125 two weeks after this model launches.\n",
    "LLM_NAME = \"gpt-3.5-turbo\"\n",
    "TEMPERATURE = 0.1\n",
    "RANDOM_SEED = 415\n",
    "\n",
    "# Separate all the context together by space.\n",
    "contexts_combined = ' '.join(context)\n",
    "\n",
    "SYSTEM_PROMPT = f\"\"\"Use the Context below to answer the user's question. Be clear, factual, complete, concise.\n",
    "If the answer is not in the Context, say \"I don't know\". \n",
    "Otherwise answer with fewer than 4 sentences and cite the grounding sources.\n",
    "Context: contexts_combined\n",
    "Answer: The answer to the question.\n",
    "Grounding sources: {context_metadata[0]['source']}\n",
    "\"\"\"\n",
    "\n",
    "# Generate response using the OpenAI API.\n",
    "response = openai_client.chat.completions.create(\n",
    "    messages=[\n",
    "        {\"role\": \"system\", \"content\": SYSTEM_PROMPT,},\n",
    "        {\"role\": \"user\", \"content\": f\"question: {SAMPLE_QUESTION}\",}\n",
    "    ],\n",
    "    model=LLM_NAME,\n",
    "    temperature=TEMPERATURE,\n",
    "    seed=RANDOM_SEED,\n",
    ")\n",
    "\n",
    "# Print the question and answer along with grounding sources and citations.\n",
    "print(f\"Question: {SAMPLE_QUESTION}\")\n",
    "\n",
    "# Print all choices in the response\n",
    "for i, choice in enumerate(response.choices, 1):\n",
    "    # Print the answer\n",
    "    pprint.pprint(f\"Answer: {choice.message.content}\")\n",
    "    print(\"\\n\")\n",
    "\n",
    "# Question1: What do the parameters for HNSW mean?\n",
    "# Answer: Perfect!\n",
    "# Best answer:  M: maximum degree of nodes in a layer of the graph. \n",
    "# efConstruction: number of nearest neighbors to consider when connecting nodes in the graph.\n",
    "# ef: number of nearest neighbors to consider when searching for similar vectors. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "d0e81e68",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Drop collection\n",
    "utility.drop_collection(COLLECTION_NAME)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "c777937e",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Author: Christy Bergman\n",
      "\n",
      "Python implementation: CPython\n",
      "Python version       : 3.11.7\n",
      "IPython version      : 8.21.0\n",
      "\n",
      "pymilvus : 2.3.6\n",
      "langchain: 0.1.5\n",
      "openai   : 1.11.1\n",
      "\n",
      "conda environment: py311\n",
      "\n"
     ]
    }
   ],
   "source": [
    "# Props to Sebastian Raschka for this handy watermark.\n",
    "# !pip install watermark\n",
    "\n",
    "%load_ext watermark\n",
    "%watermark -a 'Christy Bergman' -v -p pymilvus,langchain,openai --conda"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
